Overview
Dataset statistics
| Number of variables | 21 |
|---|---|
| Number of observations | 998 |
| Missing cells | 81 |
| Missing cells (%) | 0.4% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 565.6 KiB |
| Average record size in memory | 580.3 B |
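The percentage and per-record figures in this table follow from the raw counts. A minimal sketch of the arithmetic, assuming "Missing cells (%)" is taken over all cells (variables × observations) and that 1 KiB = 1024 B:

```python
# Figures copied from the Dataset statistics table.
n_variables = 21
n_observations = 998
missing_cells = 81
total_size_kib = 565.6

total_cells = n_variables * n_observations              # 20958 cells in total
missing_pct = 100 * missing_cells / total_cells         # share of empty cells
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Missing cells: {missing_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```

Both results round to the values reported above (0.4% and 580.3 B).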
Variable types
| Text | 2 |
|---|---|
| Categorical | 9 |
| DateTime | 1 |
| Numeric | 8 |
| Boolean | 1 |
Dataset
| Description | JHB_DPHRU_053 - Quality-corrected harmonized data |
|---|---|
| Creator | RP2 Clinical Data Quality Team |
| Author | Quality-Checked Data |
| URL | HEAT Research Projects |
Variable descriptions
| Age (at enrolment) | Patient age at study enrollment |
|---|---|
| CD4 cell count (cells/µL) | CD4+ T lymphocyte count (missing codes removed) |
| HIV viral load (copies/mL) | HIV RNA copies per mL (missing codes removed) |
| BMI (kg/m²) | Body Mass Index (extreme values removed) |
| Waist circumference (cm) | Waist circumference (corrected from mm to cm) |
| ALT (U/L) | Alanine aminotransferase (missing codes removed) |
| Platelet count (×10³/µL) | Platelet count (missing codes removed) |
| Hematocrit (%) | Hematocrit (zero values removed) |
| Lymphocyte count (×10⁹/L) | Lymphocyte absolute count (corrected labeling) |
| Neutrophil count (×10⁹/L) | Neutrophil absolute count (corrected labeling) |
| cd4_correction_applied | Quality flag: CD4 missing codes removed |
| final_comprehensive_fix_applied | Quality flag: Comprehensive corrections applied |
| waist_circ_unit_correction_applied | Quality flag: Waist circ unit corrected |
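The descriptions above mention two kinds of correction: removing missing-value codes and converting waist circumference from mm to cm. The report does not say which codes or thresholds were used, so the sketch below is purely illustrative: the sentinel values `-999` and `9999`, the `300` cut-off, and both function names are hypothetical.

```python
import math

# Hypothetical sentinel codes; the report does not state which codes were removed.
MISSING_CODES = {-999, 9999}


def remove_missing_codes(values, codes=MISSING_CODES):
    """Replace sentinel codes with NaN so they are not treated as measurements."""
    return [math.nan if v in codes else v for v in values]


def waist_mm_to_cm(value, threshold=300):
    """Convert a waist reading to cm if it looks like it was recorded in mm.

    The threshold is an illustrative assumption: readings above it are
    treated as millimetres and divided by 10.
    """
    return value / 10 if value > threshold else value


# Toy inputs, not taken from the dataset.
cd4 = remove_missing_codes([350, -999, 512, 9999])
waist = [waist_mm_to_cm(v) for v in [92.0, 880.0, 101.5]]
print(cd4)
print(waist)
```

In this dataset the `waist_circ_unit_correction_applied` flag is constant `False`, so a conversion of this kind would not have changed any row.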
Alerts
| study_source has constant value "JHB_DPHRU_053" | Constant |
|---|---|
| latitude has constant value "-26.2041" | Constant |
| longitude has constant value "28.0473" | Constant |
| province has constant value "Gauteng" | Constant |
| city has constant value "Johannesburg" | Constant |
| jhb_subregion has constant value "Central_JHB" | Constant |
| cd4_correction_applied has constant value "0.0" | Constant |
| final_comprehensive_fix_applied has constant value "1.0" | Constant |
| waist_circ_unit_correction_applied has constant value "False" | Constant |
| BMI (kg/m²) is highly overall correlated with Sex and 1 other field | High correlation |
| Sex is highly overall correlated with BMI (kg/m²) and 1 other field | High correlation |
| diastolic_bp_mmHg is highly overall correlated with systolic_bp_mmHg | High correlation |
| height_m is highly overall correlated with Sex | High correlation |
| systolic_bp_mmHg is highly overall correlated with diastolic_bp_mmHg | High correlation |
| weight_kg is highly overall correlated with BMI (kg/m²) | High correlation |
| total_cholesterol_mg_dL has 26 (2.6%) missing values | Missing |
| Triglycerides (mg/dL) has 26 (2.6%) missing values | Missing |
| anonymous_patient_id has unique values | Unique |
| Patient ID has unique values | Unique |
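The Constant, Missing and Unique labels above can be approximated with a few set operations. The sketch below is a simplified illustration, not the profiling library's own logic (which applies configurable thresholds); the function name `column_alerts` and the toy columns are hypothetical.

```python
def column_alerts(values):
    """Return simple alert labels for one column; None marks a missing cell."""
    present = [v for v in values if v is not None]
    n_missing = len(values) - len(present)
    distinct = set(present)

    alerts = []
    if present and len(distinct) == 1:
        alerts.append("Constant")   # every observed value is identical
    if n_missing == 0 and len(values) > 1 and len(distinct) == len(values):
        alerts.append("Unique")     # no value repeats and none is missing
    if n_missing:
        alerts.append("Missing")    # at least one empty cell
    return alerts


# Toy columns with four rows each; the position of None is made up.
columns = {
    "city": ["Johannesburg"] * 4,
    "Patient ID": ["GSK1001", "GSK1003", "GSK1004", "GSK1006"],
    "total_cholesterol_mg_dL": [4.11, None, 4.04, 4.11],
}
alerts = {name: column_alerts(values) for name, values in columns.items()}
print(alerts)
```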
Reproduction
| Analysis started | 2025-11-24 21:49:13.964011 |
|---|---|
| Analysis finished | 2025-11-24 21:49:16.702402 |
| Duration | 2.74 seconds |
| Software version | ydata-profiling v4.18.0 |
| Download configuration | config.json |
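The reported duration can be checked against the two timestamps. A small sketch using only the standard library:

```python
from datetime import datetime

# Timestamps copied from the Reproduction table.
started = datetime.fromisoformat("2025-11-24 21:49:13.964011")
finished = datetime.fromisoformat("2025-11-24 21:49:16.702402")

duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")
```

The difference rounds to the 2.74 seconds shown in the table.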
Variables
anonymous_patient_id
Text
Unique
| Distinct | 998 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 72.1 KiB |
Length
| Max length | 17 |
|---|---|
| Median length | 17 |
| Mean length | 17 |
| Min length | 17 |
Unique
| Unique | 998 |
|---|---|
| Unique (%) | 100.0% |
Sample
| 1st row | HEAT_EFCE0743072E |
|---|---|
| 2nd row | HEAT_F3BA2B285DB1 |
| 3rd row | HEAT_2B8BDBC0C1EE |
| 4th row | HEAT_E3EC25AD8189 |
| 5th row | HEAT_17FEBF78F855 |
| Value | Count | Frequency (%) |
| heat_efce0743072e | 1 | 0.1% |
| heat_a402e1bca908 | 1 | 0.1% |
| heat_2ec5321500fb | 1 | 0.1% |
| heat_b216e9aa2d15 | 1 | 0.1% |
| heat_2b8bdbc0c1ee | 1 | 0.1% |
| heat_e3ec25ad8189 | 1 | 0.1% |
| heat_17febf78f855 | 1 | 0.1% |
| heat_5fe7a2fc6a9c | 1 | 0.1% |
| heat_0058eddd14fa | 1 | 0.1% |
| heat_a686658cd4f5 | 1 | 0.1% |
| Other values (988) | 988 |
Most occurring characters
| Value | Count | Frequency (%) |
| E | 1796 | 10.6% |
| A | 1749 | 10.3% |
| H | 998 | 5.9% |
| T | 998 | 5.9% |
| _ | 998 | 5.9% |
| 8 | 785 | 4.6% |
| 4 | 783 | 4.6% |
| 3 | 777 | 4.6% |
| 2 | 766 | 4.5% |
| 0 | 765 | 4.5% |
| Other values (9) | 6551 |
Most occurring categories
| Value | Count | Frequency (%) |
| Uppercase Letter | 8454 | |
| Decimal Number | 7514 | |
| Connector Punctuation | 998 | 5.9% |
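The category names in this table correspond to Unicode general categories (`Lu`, `Nd`, `Pc`). As a rough illustration of how such counts can be produced, the sketch below tallies categories with Python's `unicodedata` module over the five sample IDs listed earlier; the function name `category_counts` and the name mapping are assumptions, not the profiler's code.

```python
import unicodedata
from collections import Counter

# Readable names for the three general categories that occur in these IDs.
CATEGORY_NAMES = {
    "Lu": "Uppercase Letter",
    "Nd": "Decimal Number",
    "Pc": "Connector Punctuation",
}


def category_counts(strings):
    """Count characters per Unicode general category across all strings."""
    counts = Counter()
    for text in strings:
        for ch in text:
            code = unicodedata.category(ch)
            counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts


# The five sample rows shown for anonymous_patient_id.
sample_ids = [
    "HEAT_EFCE0743072E",
    "HEAT_F3BA2B285DB1",
    "HEAT_2B8BDBC0C1EE",
    "HEAT_E3EC25AD8189",
    "HEAT_17FEBF78F855",
]
counts = category_counts(sample_ids)
print(counts)
```

Over the full column, the three category counts sum to 998 × 17 = 16966 characters, which matches the ASCII block total reported for this variable.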
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 8 | 785 | |
| 4 | 783 | |
| 3 | 777 | |
| 2 | 766 | |
| 0 | 765 | |
| 9 | 765 | |
| 5 | 739 | |
| 6 | 723 | |
| 1 | 721 | |
| 7 | 690 |
Uppercase Letter
| Value | Count | Frequency (%) |
| E | 1796 | |
| A | 1749 | |
| H | 998 | |
| T | 998 | |
| D | 746 | |
| C | 735 | |
| F | 732 | |
| B | 700 | 8.3% |
Connector Punctuation
| Value | Count | Frequency (%) |
| _ | 998 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 8512 | |
| Latin | 8454 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| _ | 998 | |
| 8 | 785 | |
| 4 | 783 | |
| 3 | 777 | |
| 2 | 766 | |
| 0 | 765 | |
| 9 | 765 | |
| 5 | 739 | |
| 6 | 723 | |
| 1 | 721 |
Latin
| Value | Count | Frequency (%) |
| E | 1796 | |
| A | 1749 | |
| H | 998 | |
| T | 998 | |
| D | 746 | |
| C | 735 | |
| F | 732 | |
| B | 700 | 8.3% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 16966 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| E | 1796 | 10.6% |
| A | 1749 | 10.3% |
| H | 998 | 5.9% |
| T | 998 | 5.9% |
| _ | 998 | 5.9% |
| 8 | 785 | 4.6% |
| 4 | 783 | 4.6% |
| 3 | 777 | 4.6% |
| 2 | 766 | 4.5% |
| 0 | 765 | 4.5% |
| Other values (9) | 6551 |
Patient ID
Text
Unique
| Distinct | 998 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 63.7 KiB |
Length
| Max length | 10 |
|---|---|
| Median length | 8 |
| Mean length | 8.3396794 |
| Min length | 7 |
Unique
| Unique | 998 |
|---|---|
| Unique (%) | 100.0% |
Sample
| 1st row | GSK1001 |
|---|---|
| 2nd row | GSK1003 |
| 3rd row | GSK1004 |
| 4th row | GSK1006 |
| 5th row | GSK1007 |
| Value | Count | Frequency (%) |
| gsk1001 | 1 | 0.1% |
| gsk1019 | 1 | 0.1% |
| gsk1045 | 1 | 0.1% |
| gsk1042 | 1 | 0.1% |
| gsk1004 | 1 | 0.1% |
| gsk1006 | 1 | 0.1% |
| gsk1007 | 1 | 0.1% |
| gsk1008 | 1 | 0.1% |
| gsk1010 | 1 | 0.1% |
| gsk1011 | 1 | 0.1% |
| Other values (988) | 988 |
Most occurring characters
| Value | Count | Frequency (%) |
| 1 | 1122 | |
| G | 998 | |
| S | 998 | |
| K | 998 | |
| 0 | 592 | |
| 3 | 525 | |
| 2 | 505 | |
| 4 | 498 | |
| 5 | 476 | 5.7% |
| 6 | 406 | 4.9% |
| Other values (3) | 1205 |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 5329 | |
| Uppercase Letter | 2994 |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 1 | 1122 | |
| 0 | 592 | |
| 3 | 525 | |
| 2 | 505 | |
| 4 | 498 | |
| 5 | 476 | |
| 6 | 406 | 7.6% |
| 9 | 405 | 7.6% |
| 7 | 404 | 7.6% |
| 8 | 396 | 7.4% |
Uppercase Letter
| Value | Count | Frequency (%) |
| G | 998 | |
| S | 998 | |
| K | 998 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 5329 | |
| Latin | 2994 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 1 | 1122 | |
| 0 | 592 | |
| 3 | 525 | |
| 2 | 505 | |
| 4 | 498 | |
| 5 | 476 | |
| 6 | 406 | 7.6% |
| 9 | 405 | 7.6% |
| 7 | 404 | 7.6% |
| 8 | 396 | 7.4% |
Latin
| Value | Count | Frequency (%) |
| G | 998 | |
| S | 998 | |
| K | 998 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 8323 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 1 | 1122 | |
| G | 998 | |
| S | 998 | |
| K | 998 | |
| 0 | 592 | |
| 3 | 525 | |
| 2 | 505 | |
| 4 | 498 | |
| 5 | 476 | 5.7% |
| 6 | 406 | 4.9% |
| Other values (3) | 1205 |
study_source
Categorical
Constant
| Distinct | 1 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 68.2 KiB |
| JHB_DPHRU_053 |
|---|
Length
| Max length | 13 |
|---|---|
| Median length | 13 |
| Mean length | 13 |
| Min length | 13 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | JHB_DPHRU_053 |
|---|---|
| 2nd row | JHB_DPHRU_053 |
| 3rd row | JHB_DPHRU_053 |
| 4th row | JHB_DPHRU_053 |
| 5th row | JHB_DPHRU_053 |
Common Values
| Value | Count | Frequency (%) |
| JHB_DPHRU_053 | 998 |
Common Values (Plot)
| Value | Count | Frequency (%) |
| jhb_dphru_053 | 998 |
Most occurring characters
| Value | Count | Frequency (%) |
| H | 1996 | |
| _ | 1996 | |
| J | 998 | |
| B | 998 | |
| D | 998 | |
| P | 998 | |
| R | 998 | |
| U | 998 | |
| 0 | 998 | |
| 5 | 998 |
Most occurring categories
| Value | Count | Frequency (%) |
| Uppercase Letter | 7984 | |
| Decimal Number | 2994 | 23.1% |
| Connector Punctuation | 1996 | 15.4% |
Most frequent character per category
Uppercase Letter
| Value | Count | Frequency (%) |
| H | 1996 | |
| J | 998 | |
| B | 998 | |
| D | 998 | |
| P | 998 | |
| R | 998 | |
| U | 998 |
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 998 | |
| 5 | 998 | |
| 3 | 998 |
Connector Punctuation
| Value | Count | Frequency (%) |
| _ | 1996 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 7984 | |
| Common | 4990 |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| H | 1996 | |
| J | 998 | |
| B | 998 | |
| D | 998 | |
| P | 998 | |
| R | 998 | |
| U | 998 |
Common
| Value | Count | Frequency (%) |
| _ | 1996 | |
| 0 | 998 | |
| 5 | 998 | |
| 3 | 998 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 12974 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| H | 1996 | |
| _ | 1996 | |
| J | 998 | |
| B | 998 | |
| D | 998 | |
| P | 998 | |
| R | 998 | |
| U | 998 | |
| 0 | 998 | |
| 5 | 998 |
primary_date
Date
| Distinct | 301 |
|---|---|
| Distinct (%) | 30.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 15.6 KiB |
| Minimum | 2017-01-23 00:00:00 |
|---|---|
| Maximum | 2018-07-24 00:00:00 |
| Invalid dates | 0 |
| Invalid dates (%) | 0.0% |
Age (at enrolment)
Real number (ℝ)
Patient age at study enrollment
| Distinct | 29 |
|---|---|
| Distinct (%) | 2.9% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 53.599198 |
| Minimum | 41 |
|---|---|
| Maximum | 71 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 15.6 KiB |
Quantile statistics
| Minimum | 41 |
|---|---|
| 5-th percentile | 44 |
| Q1 | 49 |
| median | 53 |
| Q3 | 59 |
| 95-th percentile | 63 |
| Maximum | 71 |
| Range | 30 |
| Interquartile range (IQR) | 10 |
Descriptive statistics
| Standard deviation | 5.9694709 |
|---|---|
| Coefficient of variation (CV) | 0.11137239 |
| Kurtosis | -0.97067221 |
| Mean | 53.599198 |
| Median Absolute Deviation (MAD) | 5 |
| Skewness | 0.064769631 |
| Sum | 53492 |
| Variance | 35.634583 |
| Monotonicity | Not monotonic |
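Several of the figures in the quantile and descriptive tables are derived from one another. A short consistency check, assuming n = 998 because Age has no missing values:

```python
# Values copied from the Age (at enrolment) tables.
n = 998                  # observations, none missing
total = 53492            # Sum
std = 5.9694709          # Standard deviation
q1, q3 = 49, 59
minimum, maximum = 41, 71

mean = total / n                  # reported Mean: 53.599198
cv = std / mean                   # reported CV: 0.11137239
variance = std ** 2               # reported Variance: 35.634583
iqr = q3 - q1                     # reported IQR: 10
value_range = maximum - minimum   # reported Range: 30

print(round(mean, 6), round(cv, 6), round(variance, 4), iqr, value_range)
```

The same relationships (mean = sum / n, CV = standard deviation / mean, variance = standard deviation squared) hold for the other numeric variables, with n reduced by the number of missing values.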
| Value | Count | Frequency (%) |
| 53 | 63 | 6.3% |
| 55 | 61 | 6.1% |
| 48 | 57 | 5.7% |
| 62 | 56 | 5.6% |
| 49 | 55 | 5.5% |
| 50 | 52 | 5.2% |
| 52 | 52 | 5.2% |
| 47 | 50 | 5.0% |
| 54 | 46 | 4.6% |
| 60 | 46 | 4.6% |
| Other values (19) | 460 |
| Value | Count | Frequency (%) |
| 41 | 4 | 0.4% |
| 42 | 4 | 0.4% |
| 43 | 18 | 1.8% |
| 44 | 30 | |
| 45 | 39 | |
| 46 | 39 | |
| 47 | 50 | |
| 48 | 57 | |
| 49 | 55 | |
| 50 | 52 |
| Value | Count | Frequency (%) |
| 71 | 1 | 0.1% |
| 68 | 1 | 0.1% |
| 67 | 1 | 0.1% |
| 66 | 3 | 0.3% |
| 65 | 8 | 0.8% |
| 64 | 16 | 1.6% |
| 63 | 36 | |
| 62 | 56 | |
| 61 | 40 | |
| 60 | 46 |
Sex
Categorical
High correlation
| Distinct | 2 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 60.4 KiB |
| Male | |
|---|---|
| Female |
Length
| Max length | 6 |
|---|---|
| Median length | 4 |
| Mean length | 4.995992 |
| Min length | 4 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Female |
|---|---|
| 2nd row | Female |
| 3rd row | Female |
| 4th row | Female |
| 5th row | Female |
Common Values
| Value | Count | Frequency (%) |
| Male | 501 | |
| Female | 497 |
Common Values (Plot)
| Value | Count | Frequency (%) |
| male | 501 | |
| female | 497 |
Most occurring characters
| Value | Count | Frequency (%) |
| e | 1495 | |
| a | 998 | |
| l | 998 | |
| M | 501 | 10.0% |
| F | 497 | 10.0% |
| m | 497 | 10.0% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 3988 | |
| Uppercase Letter | 998 | 20.0% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| e | 1495 | |
| a | 998 | |
| l | 998 | |
| m | 497 | 12.5% |
Uppercase Letter
| Value | Count | Frequency (%) |
| M | 501 | |
| F | 497 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 4986 |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| e | 1495 | |
| a | 998 | |
| l | 998 | |
| M | 501 | 10.0% |
| F | 497 | 10.0% |
| m | 497 | 10.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 4986 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| e | 1495 | |
| a | 998 | |
| l | 998 | |
| M | 501 | 10.0% |
| F | 497 | 10.0% |
| m | 497 | 10.0% |
latitude
Categorical
Constant
| Distinct | 1 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 63.3 KiB |
| -26.2041 |
|---|
Length
| Max length | 8 |
|---|---|
| Median length | 8 |
| Mean length | 8 |
| Min length | 8 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | -26.2041 |
|---|---|
| 2nd row | -26.2041 |
| 3rd row | -26.2041 |
| 4th row | -26.2041 |
| 5th row | -26.2041 |
Common Values
| Value | Count | Frequency (%) |
| -26.2041 | 998 |
Common Values (Plot)
| Value | Count | Frequency (%) |
| 26.2041 | 998 |
Most occurring characters
| Value | Count | Frequency (%) |
| 2 | 1996 | |
| - | 998 | |
| 6 | 998 | |
| . | 998 | |
| 0 | 998 | |
| 4 | 998 | |
| 1 | 998 |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 5988 | |
| Dash Punctuation | 998 | 12.5% |
| Other Punctuation | 998 | 12.5% |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 2 | 1996 | |
| 6 | 998 | |
| 0 | 998 | |
| 4 | 998 | |
| 1 | 998 |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 998 |
Other Punctuation
| Value | Count | Frequency (%) |
| . | 998 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 7984 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 2 | 1996 | |
| - | 998 | |
| 6 | 998 | |
| . | 998 | |
| 0 | 998 | |
| 4 | 998 | |
| 1 | 998 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 7984 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 2 | 1996 | |
| - | 998 | |
| 6 | 998 | |
| . | 998 | |
| 0 | 998 | |
| 4 | 998 | |
| 1 | 998 |
longitude
Categorical
Constant
| Distinct | 1 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 62.4 KiB |
| 28.0473 |
|---|
Length
| Max length | 7 |
|---|---|
| Median length | 7 |
| Mean length | 7 |
| Min length | 7 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 28.0473 |
|---|---|
| 2nd row | 28.0473 |
| 3rd row | 28.0473 |
| 4th row | 28.0473 |
| 5th row | 28.0473 |
Common Values
| Value | Count | Frequency (%) |
| 28.0473 | 998 |
Common Values (Plot)
| Value | Count | Frequency (%) |
| 28.0473 | 998 |
Most occurring characters
| Value | Count | Frequency (%) |
| 2 | 998 | |
| 8 | 998 | |
| . | 998 | |
| 0 | 998 | |
| 4 | 998 | |
| 7 | 998 | |
| 3 | 998 |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 5988 | |
| Other Punctuation | 998 | 14.3% |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 2 | 998 | |
| 8 | 998 | |
| 0 | 998 | |
| 4 | 998 | |
| 7 | 998 | |
| 3 | 998 |
Other Punctuation
| Value | Count | Frequency (%) |
| . | 998 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 6986 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 2 | 998 | |
| 8 | 998 | |
| . | 998 | |
| 0 | 998 | |
| 4 | 998 | |
| 7 | 998 | |
| 3 | 998 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 6986 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 2 | 998 | |
| 8 | 998 | |
| . | 998 | |
| 0 | 998 | |
| 4 | 998 | |
| 7 | 998 | |
| 3 | 998 |
province
Categorical
Constant
| Distinct | 1 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 62.4 KiB |
| Gauteng |
|---|
Length
| Max length | 7 |
|---|---|
| Median length | 7 |
| Mean length | 7 |
| Min length | 7 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Gauteng |
|---|---|
| 2nd row | Gauteng |
| 3rd row | Gauteng |
| 4th row | Gauteng |
| 5th row | Gauteng |
Common Values
| Value | Count | Frequency (%) |
| Gauteng | 998 |
Common Values (Plot)
| Value | Count | Frequency (%) |
| gauteng | 998 |
Most occurring characters
| Value | Count | Frequency (%) |
| G | 998 | |
| a | 998 | |
| u | 998 | |
| t | 998 | |
| e | 998 | |
| n | 998 | |
| g | 998 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 5988 | |
| Uppercase Letter | 998 | 14.3% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| a | 998 | |
| u | 998 | |
| t | 998 | |
| e | 998 | |
| n | 998 | |
| g | 998 |
Uppercase Letter
| Value | Count | Frequency (%) |
| G | 998 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 6986 |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| G | 998 | |
| a | 998 | |
| u | 998 | |
| t | 998 | |
| e | 998 | |
| n | 998 | |
| g | 998 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 6986 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| G | 998 | |
| a | 998 | |
| u | 998 | |
| t | 998 | |
| e | 998 | |
| n | 998 | |
| g | 998 |
city
Categorical
Constant
| Distinct | 1 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 67.2 KiB |
| Johannesburg |
|---|
Length
| Max length | 12 |
|---|---|
| Median length | 12 |
| Mean length | 12 |
| Min length | 12 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Johannesburg |
|---|---|
| 2nd row | Johannesburg |
| 3rd row | Johannesburg |
| 4th row | Johannesburg |
| 5th row | Johannesburg |
Common Values
| Value | Count | Frequency (%) |
| Johannesburg | 998 |
Common Values (Plot)
| Value | Count | Frequency (%) |
| johannesburg | 998 |
Most occurring characters
| Value | Count | Frequency (%) |
| n | 1996 | |
| J | 998 | |
| o | 998 | |
| h | 998 | |
| a | 998 | |
| e | 998 | |
| s | 998 | |
| b | 998 | |
| u | 998 | |
| r | 998 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 10978 | |
| Uppercase Letter | 998 | 8.3% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| n | 1996 | |
| o | 998 | |
| h | 998 | |
| a | 998 | |
| e | 998 | |
| s | 998 | |
| b | 998 | |
| u | 998 | |
| r | 998 | |
| g | 998 |
Uppercase Letter
| Value | Count | Frequency (%) |
| J | 998 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 11976 |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| n | 1996 | |
| J | 998 | |
| o | 998 | |
| h | 998 | |
| a | 998 | |
| e | 998 | |
| s | 998 | |
| b | 998 | |
| u | 998 | |
| r | 998 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 11976 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| n | 1996 | |
| J | 998 | |
| o | 998 | |
| h | 998 | |
| a | 998 | |
| e | 998 | |
| s | 998 | |
| b | 998 | |
| u | 998 | |
| r | 998 |
jhb_subregion
Categorical
Constant
| Distinct | 1 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 66.3 KiB |
| Central_JHB |
|---|
Length
| Max length | 11 |
|---|---|
| Median length | 11 |
| Mean length | 11 |
| Min length | 11 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Central_JHB |
|---|---|
| 2nd row | Central_JHB |
| 3rd row | Central_JHB |
| 4th row | Central_JHB |
| 5th row | Central_JHB |
Common Values
| Value | Count | Frequency (%) |
| Central_JHB | 998 |
Common Values (Plot)
| Value | Count | Frequency (%) |
| central_jhb | 998 |
Most occurring characters
| Value | Count | Frequency (%) |
| C | 998 | |
| e | 998 | |
| n | 998 | |
| t | 998 | |
| r | 998 | |
| a | 998 | |
| l | 998 | |
| _ | 998 | |
| J | 998 | |
| H | 998 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 5988 | |
| Uppercase Letter | 3992 | |
| Connector Punctuation | 998 | 9.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| e | 998 | |
| n | 998 | |
| t | 998 | |
| r | 998 | |
| a | 998 | |
| l | 998 |
Uppercase Letter
| Value | Count | Frequency (%) |
| C | 998 | |
| J | 998 | |
| H | 998 | |
| B | 998 |
Connector Punctuation
| Value | Count | Frequency (%) |
| _ | 998 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 9980 | |
| Common | 998 | 9.1% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| C | 998 | |
| e | 998 | |
| n | 998 | |
| t | 998 | |
| r | 998 | |
| a | 998 | |
| l | 998 | |
| J | 998 | |
| H | 998 | |
| B | 998 |
Common
| Value | Count | Frequency (%) |
| _ | 998 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 10978 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| C | 998 | |
| e | 998 | |
| n | 998 | |
| t | 998 | |
| r | 998 | |
| a | 998 | |
| l | 998 | |
| _ | 998 | |
| J | 998 | |
| H | 998 |
BMI (kg/m²)
Real number (ℝ)
High correlation
| Distinct | 816 |
|---|---|
| Distinct (%) | 82.2% |
| Missing | 5 |
| Missing (%) | 0.5% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 29.560846 |
| Minimum | 15.24 |
|---|---|
| Maximum | 65.89 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 15.6 KiB |
Quantile statistics
| Minimum | 15.24 |
|---|---|
| 5-th percentile | 18.286 |
| Q1 | 23.53 |
| median | 29.07 |
| Q3 | 34.52 |
| 95-th percentile | 42.696 |
| Maximum | 65.89 |
| Range | 50.65 |
| Interquartile range (IQR) | 10.99 |
Descriptive statistics
| Standard deviation | 7.7932914 |
|---|---|
| Coefficient of variation (CV) | 0.2636356 |
| Kurtosis | 0.67413325 |
| Mean | 29.560846 |
| Median Absolute Deviation (MAD) | 5.49 |
| Skewness | 0.63362711 |
| Sum | 29353.92 |
| Variance | 60.735391 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 32.47 | 3 | 0.3% |
| 34.76 | 3 | 0.3% |
| 31.26 | 3 | 0.3% |
| 32.57 | 3 | 0.3% |
| 30.94 | 3 | 0.3% |
| 26.96 | 3 | 0.3% |
| 28.55 | 3 | 0.3% |
| 27.85 | 3 | 0.3% |
| 30.5 | 3 | 0.3% |
| 37.2 | 3 | 0.3% |
| Other values (806) | 963 | |
| (Missing) | 5 | 0.5% |
| Value | Count | Frequency (%) |
| 15.24 | 1 | |
| 15.3 | 1 | |
| 15.39 | 1 | |
| 15.46 | 1 | |
| 15.69 | 1 | |
| 15.85 | 1 | |
| 15.93 | 1 | |
| 16.14 | 1 | |
| 16.21 | 1 | |
| 16.34 | 1 |
| Value | Count | Frequency (%) |
| 65.89 | 1 | |
| 62.84 | 1 | |
| 58.62 | 1 | |
| 56.42 | 1 | |
| 56.09 | 1 | |
| 53.91 | 1 | |
| 53.85 | 1 | |
| 53.13 | 1 | |
| 52.89 | 1 | |
| 51.37 | 1 |
weight_kg
Real number (ℝ)
High correlation
| Distinct | 529 |
|---|---|
| Distinct (%) | 53.3% |
| Missing | 5 |
| Missing (%) | 0.5% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 79.47718 |
| Minimum | 37 |
|---|---|
| Maximum | 168.8 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 15.6 KiB |
Quantile statistics
| Minimum | 37 |
|---|---|
| 5-th percentile | 51.9 |
| Q1 | 65.2 |
| median | 78.1 |
| Q3 | 90.9 |
| 95-th percentile | 113 |
| Maximum | 168.8 |
| Range | 131.8 |
| Interquartile range (IQR) | 25.7 |
Descriptive statistics
| Standard deviation | 19.199436 |
|---|---|
| Coefficient of variation (CV) | 0.24157168 |
| Kurtosis | 0.76277468 |
| Mean | 79.47718 |
| Median Absolute Deviation (MAD) | 12.9 |
| Skewness | 0.63115531 |
| Sum | 78920.84 |
| Variance | 368.61833 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 78.8 | 7 | 0.7% |
| 77.2 | 6 | 0.6% |
| 74.2 | 6 | 0.6% |
| 66.6 | 6 | 0.6% |
| 63.9 | 5 | 0.5% |
| 85.3 | 5 | 0.5% |
| 90.6 | 5 | 0.5% |
| 80.7 | 5 | 0.5% |
| 74 | 5 | 0.5% |
| 60.5 | 5 | 0.5% |
| Other values (519) | 938 |
| Value | Count | Frequency (%) |
| 37 | 1 | |
| 38.3 | 1 | |
| 39.7 | 1 | |
| 40.7 | 1 | |
| 40.9 | 1 | |
| 42.2 | 1 | |
| 42.4 | 1 | |
| 42.9 | 1 | |
| 43.5 | 1 | |
| 43.8 | 1 |
| Value | Count | Frequency (%) |
| 168.8 | 1 | |
| 162.2 | 1 | |
| 153.9 | 1 | |
| 144.5 | 1 | |
| 143 | 1 | |
| 136.8 | 1 | |
| 134.8 | 1 | |
| 134.7 | 1 | |
| 133.6 | 1 | |
| 133.4 | 1 |
height_m
Real number (ℝ)
High correlation
| Distinct | 50 |
|---|---|
| Distinct (%) | 5.0% |
| Missing | 5 |
| Missing (%) | 0.5% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1.646858 |
| Minimum | 1.39 |
|---|---|
| Maximum | 1.92 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 15.6 KiB |
Quantile statistics
| Minimum | 1.39 |
|---|---|
| 5-th percentile | 1.5 |
| Q1 | 1.58 |
| median | 1.64 |
| Q3 | 1.71 |
| 95-th percentile | 1.79 |
| Maximum | 1.92 |
| Range | 0.53 |
| Interquartile range (IQR) | 0.13 |
Descriptive statistics
| Standard deviation | 0.090536021 |
|---|---|
| Coefficient of variation (CV) | 0.054975001 |
| Kurtosis | -0.40941396 |
| Mean | 1.646858 |
| Median Absolute Deviation (MAD) | 0.07 |
| Skewness | 0.11788112 |
| Sum | 1635.33 |
| Variance | 0.0081967711 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 1.58 | 53 | 5.3% |
| 1.63 | 42 | 4.2% |
| 1.74 | 42 | 4.2% |
| 1.62 | 39 | 3.9% |
| 1.7 | 38 | 3.8% |
| 1.59 | 38 | 3.8% |
| 1.69 | 38 | 3.8% |
| 1.64 | 38 | 3.8% |
| 1.67 | 37 | 3.7% |
| 1.57 | 37 | 3.7% |
| Other values (40) | 591 |
| Value | Count | Frequency (%) |
| 1.39 | 1 | 0.1% |
| 1.44 | 3 | 0.3% |
| 1.45 | 6 | 0.6% |
| 1.46 | 9 | |
| 1.47 | 5 | 0.5% |
| 1.48 | 9 | |
| 1.49 | 7 | |
| 1.5 | 11 | |
| 1.51 | 6 | 0.6% |
| 1.52 | 16 |
| Value | Count | Frequency (%) |
| 1.92 | 1 | 0.1% |
| 1.91 | 2 | 0.2% |
| 1.9 | 1 | 0.1% |
| 1.89 | 1 | 0.1% |
| 1.88 | 1 | 0.1% |
| 1.87 | 3 | 0.3% |
| 1.86 | 1 | 0.1% |
| 1.85 | 3 | 0.3% |
| 1.84 | 2 | 0.2% |
| 1.83 | 10 |
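The alerts list weight_kg as highly correlated with BMI (kg/m²), which is expected: BMI is conventionally defined as weight in kilograms divided by the square of height in metres. A minimal sketch of that formula follows; the function name `bmi` is illustrative, and the inputs are the reported median weight and height, used only as example numbers.

```python
def bmi(weight_kg, height_m):
    """Body Mass Index in kg/m²: weight divided by height squared."""
    if height_m <= 0:
        raise ValueError("height_m must be positive")
    return weight_kg / height_m ** 2


# Example inputs: the median weight_kg and median height_m from the tables.
example = bmi(78.1, 1.64)
print(f"{example:.2f} kg/m²")
```

The BMI of the median weight and median height need not equal the median BMI, because the two medians generally belong to different participants.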
systolic_bp_mmHg
Real number (ℝ)
High correlation
| Distinct | 196 |
|---|---|
| Distinct (%) | 19.8% |
| Missing | 7 |
| Missing (%) | 0.7% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 132.32341 |
| Minimum | 81 |
|---|---|
| Maximum | 258.5 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 15.6 KiB |
Quantile statistics
| Minimum | 81 |
|---|---|
| 5-th percentile | 103 |
| Q1 | 117.75 |
| median | 130 |
| Q3 | 144 |
| 95-th percentile | 169.75 |
| Maximum | 258.5 |
| Range | 177.5 |
| Interquartile range (IQR) | 26.25 |
Descriptive statistics
| Standard deviation | 21.355436 |
|---|---|
| Coefficient of variation (CV) | 0.16138819 |
| Kurtosis | 2.2537289 |
| Mean | 132.32341 |
| Median Absolute Deviation (MAD) | 13 |
| Skewness | 0.92951207 |
| Sum | 131132.5 |
| Variance | 456.05464 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 123 | 16 | 1.6% |
| 124.5 | 15 | 1.5% |
| 117.5 | 15 | 1.5% |
| 136.5 | 15 | 1.5% |
| 119 | 15 | 1.5% |
| 118.5 | 14 | 1.4% |
| 126.5 | 14 | 1.4% |
| 124 | 14 | 1.4% |
| 142.5 | 13 | 1.3% |
| 134.5 | 13 | 1.3% |
| Other values (186) | 847 |
| Value | Count | Frequency (%) |
| 81 | 1 | |
| 85 | 1 | |
| 86.5 | 1 | |
| 87 | 1 | |
| 87.5 | 1 | |
| 88 | 1 | |
| 88.5 | 1 | |
| 89 | 1 | |
| 89.5 | 1 | |
| 90 | 2 |
| Value | Count | Frequency (%) |
| 258.5 | 1 | |
| 239 | 1 | |
| 215 | 1 | |
| 213 | 1 | |
| 211 | 1 | |
| 208.5 | 1 | |
| 207.5 | 1 | |
| 206.5 | 1 | |
| 193.5 | 1 | |
| 192 | 1 |
diastolic_bp_mmHg
Real number (ℝ)
High correlation
| Distinct | 133 |
|---|---|
| Distinct (%) | 13.4% |
| Missing | 7 |
| Missing (%) | 0.7% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 88.397074 |
| Minimum | 48.5 |
|---|---|
| Maximum | 150 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 15.6 KiB |
Quantile statistics
| Minimum | 48.5 |
|---|---|
| 5-th percentile | 70 |
| Q1 | 80 |
| median | 87.5 |
| Q3 | 96 |
| 95-th percentile | 109.5 |
| Maximum | 150 |
| Range | 101.5 |
| Interquartile range (IQR) | 16 |
Descriptive statistics
| Standard deviation | 12.449453 |
|---|---|
| Coefficient of variation (CV) | 0.14083558 |
| Kurtosis | 1.1474371 |
| Mean | 88.397074 |
| Median Absolute Deviation (MAD) | 8 |
| Skewness | 0.55858342 |
| Sum | 87601.5 |
| Variance | 154.98889 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 79.5 | 23 | 2.3% |
| 86 | 23 | 2.3% |
| 87.5 | 22 | 2.2% |
| 96 | 22 | 2.2% |
| 83.5 | 21 | 2.1% |
| 84 | 21 | 2.1% |
| 83 | 20 | 2.0% |
| 90.5 | 20 | 2.0% |
| 90 | 20 | 2.0% |
| 86.5 | 20 | 2.0% |
| Other values (123) | 779 |
| Value | Count | Frequency (%) |
| 48.5 | 1 | |
| 56 | 1 | |
| 57 | 1 | |
| 58.5 | 1 | |
| 60 | 1 | |
| 60.5 | 1 | |
| 61 | 2 | |
| 61.5 | 1 | |
| 62.5 | 1 | |
| 63 | 2 |
| Value | Count | Frequency (%) |
| 150 | 1 | |
| 141.5 | 1 | |
| 133.5 | 1 | |
| 131.5 | 1 | |
| 131 | 1 | |
| 130 | 1 | |
| 129.5 | 1 | |
| 125.5 | 2 | |
| 125 | 1 | |
| 122.5 | 2 |
total_cholesterol_mg_dL
Real number (ℝ)
Missing
| Distinct | 369 |
|---|---|
| Distinct (%) | 38.0% |
| Missing | 26 |
| Missing (%) | 2.6% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 4.3946914 |
| Minimum | 1.6 |
|---|---|
| Maximum | 11.98 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 15.6 KiB |
Quantile statistics
| Minimum | 1.6 |
|---|---|
| 5-th percentile | 2.8555 |
| Q1 | 3.68 |
| median | 4.35 |
| Q3 | 5.02 |
| 95-th percentile | 6.1745 |
| Maximum | 11.98 |
| Range | 10.38 |
| Interquartile range (IQR) | 1.34 |
Descriptive statistics
| Standard deviation | 1.0298543 |
|---|---|
| Coefficient of variation (CV) | 0.23434052 |
| Kurtosis | 2.8989069 |
| Mean | 4.3946914 |
| Median Absolute Deviation (MAD) | 0.67 |
| Skewness | 0.72446762 |
| Sum | 4271.64 |
| Variance | 1.0605998 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 4.11 | 11 | 1.1% |
| 4.04 | 10 | 1.0% |
| 4.75 | 9 | 0.9% |
| 3.75 | 8 | 0.8% |
| 4.45 | 8 | 0.8% |
| 5.42 | 7 | 0.7% |
| 3.55 | 7 | 0.7% |
| 4.37 | 7 | 0.7% |
| 4.09 | 7 | 0.7% |
| 4.08 | 7 | 0.7% |
| Other values (359) | 891 | |
| (Missing) | 26 | 2.6% |
| Value | Count | Frequency (%) |
| 1.6 | 1 | |
| 1.8 | 1 | |
| 1.82 | 1 | |
| 2 | 1 | |
| 2.15 | 1 | |
| 2.18 | 1 | |
| 2.2 | 1 | |
| 2.28 | 1 | |
| 2.33 | 1 | |
| 2.35 | 1 |
| Value | Count | Frequency (%) |
| 11.98 | 1 | |
| 7.82 | 1 | |
| 7.71 | 1 | |
| 7.6 | 1 | |
| 7.43 | 1 | |
| 7.3 | 1 | |
| 7.24 | 1 | |
| 7.2 | 1 | |
| 7.09 | 1 | |
| 7.08 | 2 |
Triglycerides (mg/dL)
Real number (ℝ)
Missing
| Distinct | 207 |
|---|---|
| Distinct (%) | 21.3% |
| Missing | 26 |
| Missing (%) | 2.6% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1.0413786 |
| Minimum | 0.22 |
|---|---|
| Maximum | 10.42 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 15.6 KiB |
Quantile statistics
| Minimum | 0.22 |
|---|---|
| 5-th percentile | 0.42 |
| Q1 | 0.64 |
| median | 0.85 |
| Q3 | 1.21 |
| 95-th percentile | 2.1545 |
| Maximum | 10.42 |
| Range | 10.2 |
| Interquartile range (IQR) | 0.57 |
Descriptive statistics
| Standard deviation | 0.74350213 |
|---|---|
| Coefficient of variation (CV) | 0.71395949 |
| Kurtosis | 38.239011 |
| Mean | 1.0413786 |
| Median Absolute Deviation (MAD) | 0.255 |
| Skewness | 4.6665268 |
| Sum | 1012.22 |
| Variance | 0.55279542 |
| Monotonicity | Not monotonic |
Common Values
| Value | Count | Frequency (%) |
| 0.82 | 17 | 1.7% |
| 0.81 | 17 | 1.7% |
| 0.71 | 16 | 1.6% |
| 0.66 | 16 | 1.6% |
| 0.69 | 15 | 1.5% |
| 0.83 | 14 | 1.4% |
| 0.54 | 14 | 1.4% |
| 0.67 | 13 | 1.3% |
| 0.65 | 13 | 1.3% |
| 0.78 | 13 | 1.3% |
| Other values (197) | 824 | 82.6% |
| (Missing) | 26 | 2.6% |
Minimum 10 values
| Value | Count | Frequency (%) |
| 0.22 | 1 | 0.1% |
| 0.25 | 1 | 0.1% |
| 0.28 | 1 | 0.1% |
| 0.3 | 2 | 0.2% |
| 0.32 | 2 | 0.2% |
| 0.33 | 4 | 0.4% |
| 0.34 | 4 | 0.4% |
| 0.35 | 6 | 0.6% |
| 0.36 | 2 | 0.2% |
| 0.37 | 5 | 0.5% |
Maximum 10 values
| Value | Count | Frequency (%) |
| 10.42 | 1 | 0.1% |
| 7.24 | 1 | 0.1% |
| 5.6 | 2 | 0.2% |
| 5.45 | 1 | 0.1% |
| 5.24 | 1 | 0.1% |
| 5.13 | 1 | 0.1% |
| 5.01 | 1 | 0.1% |
| 4.54 | 1 | 0.1% |
| 4.4 | 1 | 0.1% |
| 4.06 | 1 | 0.1% |
cd4_correction_applied
| Distinct | 1 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 58.5 KiB |
Length
| Max length | 3 |
|---|---|
| Median length | 3 |
| Mean length | 3 |
| Min length | 3 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 0.0 |
|---|---|
| 2nd row | 0.0 |
| 3rd row | 0.0 |
| 4th row | 0.0 |
| 5th row | 0.0 |
Common Values
| Value | Count | Frequency (%) |
| 0.0 | 998 | 100.0% |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 1996 | 66.7% |
| . | 998 | 33.3% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 1996 | 66.7% |
| Other Punctuation | 998 | 33.3% |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 1996 |
Other Punctuation
| Value | Count | Frequency (%) |
| . | 998 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 2994 | 100.0% |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 0 | 1996 | 66.7% |
| . | 998 | 33.3% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 2994 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 0 | 1996 | 66.7% |
| . | 998 | 33.3% |
final_comprehensive_fix_applied
| Distinct | 1 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 58.5 KiB |
Length
| Max length | 3 |
|---|---|
| Median length | 3 |
| Mean length | 3 |
| Min length | 3 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 1.0 |
|---|---|
| 2nd row | 1.0 |
| 3rd row | 1.0 |
| 4th row | 1.0 |
| 5th row | 1.0 |
Common Values
| Value | Count | Frequency (%) |
| 1.0 | 998 | 100.0% |
Most occurring characters
| Value | Count | Frequency (%) |
| 1 | 998 | 33.3% |
| . | 998 | 33.3% |
| 0 | 998 | 33.3% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 1996 | 66.7% |
| Other Punctuation | 998 | 33.3% |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 1 | 998 | |
| 0 | 998 |
Other Punctuation
| Value | Count | Frequency (%) |
| . | 998 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 2994 | 100.0% |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 1 | 998 | 33.3% |
| . | 998 | 33.3% |
| 0 | 998 | 33.3% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 2994 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 1 | 998 | 33.3% |
| . | 998 | 33.3% |
| 0 | 998 | 33.3% |
waist_circ_unit_correction_applied
| Distinct | 1 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 8.8 KiB |
Common Values
| Value | Count | Frequency (%) |
| False | 998 | 100.0% |
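The three quality-flag columns above each hold a single value across all 998 rows, and the character-level counts for the two string-valued flags follow directly from that: 998 values of three characters each give 2994 characters. A minimal sketch that rebuilds the columns from the values shown in the report and reproduces those counts; the helper names `is_constant` and `char_counts` are hypothetical, not part of any profiling library.

```python
from collections import Counter

N_ROWS = 998  # number of observations in the report

# Constant columns rebuilt from the values shown in the report.
columns = {
    "cd4_correction_applied": ["0.0"] * N_ROWS,
    "final_comprehensive_fix_applied": ["1.0"] * N_ROWS,
    "waist_circ_unit_correction_applied": [False] * N_ROWS,
}


def is_constant(values):
    """A column is constant when it holds exactly one distinct value."""
    return len(set(values)) == 1


def char_counts(values):
    """Count every character across all string values of a column."""
    return Counter("".join(values))


for name, values in columns.items():
    print(name, "constant:", is_constant(values))

cd4_chars = char_counts(columns["cd4_correction_applied"])
fix_chars = char_counts(columns["final_comprehensive_fix_applied"])
print(dict(cd4_chars))
print(dict(fix_chars))
```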
Interactions
Correlations
| | Age (at enrolment) | BMI (kg/m²) | Sex | Triglycerides (mg/dL) | diastolic_bp_mmHg | height_m | systolic_bp_mmHg | total_cholesterol_mg_dL | weight_kg |
|---|---|---|---|---|---|---|---|---|---|
| Age (at enrolment) | 1.000 | 0.120 | 0.138 | 0.045 | 0.010 | -0.052 | 0.200 | 0.091 | 0.118 |
| BMI (kg/m²) | 0.120 | 1.000 | 0.507 | 0.134 | 0.185 | -0.424 | 0.188 | 0.111 | 0.901 |
| Sex | 0.138 | 0.507 | 1.000 | 0.165 | 0.019 | 0.759 | 0.053 | 0.152 | 0.248 |
| Triglycerides (mg/dL) | 0.045 | 0.134 | 0.165 | 1.000 | 0.151 | 0.121 | 0.163 | 0.270 | 0.208 |
| diastolic_bp_mmHg | 0.010 | 0.185 | 0.019 | 0.151 | 1.000 | -0.001 | 0.797 | 0.083 | 0.198 |
| height_m | -0.052 | -0.424 | 0.759 | 0.121 | -0.001 | 1.000 | -0.002 | -0.108 | -0.021 |
| systolic_bp_mmHg | 0.200 | 0.188 | 0.053 | 0.163 | 0.797 | -0.002 | 1.000 | 0.084 | 0.199 |
| total_cholesterol_mg_dL | 0.091 | 0.111 | 0.152 | 0.270 | 0.083 | -0.108 | 0.084 | 1.000 | 0.075 |
| weight_kg | 0.118 | 0.901 | 0.248 | 0.208 | 0.198 | -0.021 | 0.199 | 0.075 | 1.000 |
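The matrix above is symmetric, so each pair of variables only needs to be read once. The sketch below copies the coefficients from the table and lists the pairs whose absolute value reaches a cut-off. The threshold of 0.7 is a hypothetical choice for illustration, and the report does not state which correlation measure was used, so the values are taken at face value.

```python
from itertools import combinations

names = [
    "Age (at enrolment)", "BMI (kg/m²)", "Sex", "Triglycerides (mg/dL)",
    "diastolic_bp_mmHg", "height_m", "systolic_bp_mmHg",
    "total_cholesterol_mg_dL", "weight_kg",
]
# Rows copied from the correlation table above.
matrix = [
    [1.000, 0.120, 0.138, 0.045, 0.010, -0.052, 0.200, 0.091, 0.118],
    [0.120, 1.000, 0.507, 0.134, 0.185, -0.424, 0.188, 0.111, 0.901],
    [0.138, 0.507, 1.000, 0.165, 0.019, 0.759, 0.053, 0.152, 0.248],
    [0.045, 0.134, 0.165, 1.000, 0.151, 0.121, 0.163, 0.270, 0.208],
    [0.010, 0.185, 0.019, 0.151, 1.000, -0.001, 0.797, 0.083, 0.198],
    [-0.052, -0.424, 0.759, 0.121, -0.001, 1.000, -0.002, -0.108, -0.021],
    [0.200, 0.188, 0.053, 0.163, 0.797, -0.002, 1.000, 0.084, 0.199],
    [0.091, 0.111, 0.152, 0.270, 0.083, -0.108, 0.084, 1.000, 0.075],
    [0.118, 0.901, 0.248, 0.208, 0.198, -0.021, 0.199, 0.075, 1.000],
]

THRESHOLD = 0.7  # hypothetical cut-off chosen for illustration

pairs = list(combinations(range(len(names)), 2))

# Confirm that the upper and lower triangles agree.
is_symmetric = all(matrix[i][j] == matrix[j][i] for i, j in pairs)

# Keep each unordered pair once, strongest first.
strong_pairs = sorted(
    ((names[i], names[j], matrix[i][j])
     for i, j in pairs
     if abs(matrix[i][j]) >= THRESHOLD),
    key=lambda pair: -abs(pair[2]),
)

print("symmetric:", is_symmetric)
for a, b, r in strong_pairs:
    print(f"{a} ~ {b}: {r:+.3f}")
```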
Missing values
Sample
First rows
| | anonymous_patient_id | Patient ID | study_source | primary_date | Age (at enrolment) | Sex | latitude | longitude | province | city | jhb_subregion | BMI (kg/m²) | weight_kg | height_m | systolic_bp_mmHg | diastolic_bp_mmHg | total_cholesterol_mg_dL | Triglycerides (mg/dL) | cd4_correction_applied | final_comprehensive_fix_applied | waist_circ_unit_correction_applied |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 985 | HEAT_EFCE0743072E | GSK1001 | JHB_DPHRU_053 | 2017-01-26 | 62.0 | Female | -26.2041 | 28.0473 | Gauteng | Johannesburg | Central_JHB | 26.78 | 74.8 | 1.67 | 148.5 | 89.0 | 5.58 | 1.64 | 0.0 | 1.0 | False |
| 986 | HEAT_F3BA2B285DB1 | GSK1003 | JHB_DPHRU_053 | 2017-02-11 | 54.0 | Female | -26.2041 | 28.0473 | Gauteng | Johannesburg | Central_JHB | 29.52 | 88.8 | 1.73 | 149.5 | 100.5 | 2.91 | 1.46 | 0.0 | 1.0 | False |
| 987 | HEAT_2B8BDBC0C1EE | GSK1004 | JHB_DPHRU_053 | 2017-01-23 | 62.0 | Female | -26.2041 | 28.0473 | Gauteng | Johannesburg | Central_JHB | 17.77 | 59.4 | 1.83 | 154.0 | 93.0 | 4.04 | 0.50 | 0.0 | 1.0 | False |
| 988 | HEAT_E3EC25AD8189 | GSK1006 | JHB_DPHRU_053 | 2017-01-27 | 54.0 | Female | -26.2041 | 28.0473 | Gauteng | Johannesburg | Central_JHB | 20.45 | 48.6 | 1.54 | 124.5 | 72.0 | 4.28 | 0.80 | 0.0 | 1.0 | False |
| 989 | HEAT_17FEBF78F855 | GSK1007 | JHB_DPHRU_053 | 2017-01-31 | 58.0 | Female | -26.2041 | 28.0473 | Gauteng | Johannesburg | Central_JHB | 33.99 | 98.2 | 1.70 | 117.5 | 72.5 | 5.22 | 0.90 | 0.0 | 1.0 | False |
| 990 | HEAT_5FE7A2FC6A9C | GSK1008 | JHB_DPHRU_053 | 2017-01-31 | 58.0 | Female | -26.2041 | 28.0473 | Gauteng | Johannesburg | Central_JHB | 23.46 | 72.8 | 1.76 | 134.0 | 84.0 | 4.62 | 0.73 | 0.0 | 1.0 | False |
| 991 | HEAT_0058EDDD14FA | GSK1010 | JHB_DPHRU_053 | 2017-02-15 | 60.0 | Female | -26.2041 | 28.0473 | Gauteng | Johannesburg | Central_JHB | 21.40 | 58.4 | 1.65 | 139.0 | 86.0 | 2.87 | 5.24 | 0.0 | 1.0 | False |
| 992 | HEAT_A686658CD4F5 | GSK1011 | JHB_DPHRU_053 | 2017-02-03 | 59.0 | Female | -26.2041 | 28.0473 | Gauteng | Johannesburg | Central_JHB | 20.78 | 62.2 | 1.73 | 104.0 | 75.0 | 3.07 | 1.31 | 0.0 | 1.0 | False |
| 993 | HEAT_C8D4CE31D3F4 | GSK1013 | JHB_DPHRU_053 | 2017-01-25 | 53.0 | Female | -26.2041 | 28.0473 | Gauteng | Johannesburg | Central_JHB | 21.30 | 56.2 | 1.62 | 107.0 | 77.0 | 3.86 | 0.81 | 0.0 | 1.0 | False |
| 994 | HEAT_B8B546F9EE15 | GSK1014 | JHB_DPHRU_053 | 2017-03-15 | 59.0 | Female | -26.2041 | 28.0473 | Gauteng | Johannesburg | Central_JHB | 26.00 | 80.6 | 1.76 | 139.5 | 88.5 | 5.56 | 1.30 | 0.0 | 1.0 | False |
Last rows
| | anonymous_patient_id | Patient ID | study_source | primary_date | Age (at enrolment) | Sex | latitude | longitude | province | city | jhb_subregion | BMI (kg/m²) | weight_kg | height_m | systolic_bp_mmHg | diastolic_bp_mmHg | total_cholesterol_mg_dL | Triglycerides (mg/dL) | cd4_correction_applied | final_comprehensive_fix_applied | waist_circ_unit_correction_applied |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1973 | HEAT_CDED5B8CE2A9 | GSK10133 | JHB_DPHRU_053 | 2018-03-20 | 54.0 | Male | -26.2041 | 28.0473 | Gauteng | Johannesburg | Central_JHB | 35.84 | 92.9 | 1.61 | 119.0 | 94.5 | 5.83 | 0.91 | 0.0 | 1.0 | False |
| 1974 | HEAT_BFB34305F3B2 | GSK10134 | JHB_DPHRU_053 | 2018-04-24 | 60.0 | Male | -26.2041 | 28.0473 | Gauteng | Johannesburg | Central_JHB | 27.86 | 74.6 | 1.64 | 130.0 | 88.5 | 4.61 | 0.73 | 0.0 | 1.0 | False |
| 1975 | HEAT_10E35CE1F4FB | GSK10135 | JHB_DPHRU_053 | 2018-03-01 | 53.0 | Male | -26.2041 | 28.0473 | Gauteng | Johannesburg | Central_JHB | 35.00 | 90.5 | 1.61 | NaN | NaN | NaN | NaN | 0.0 | 1.0 | False |
| 1976 | HEAT_A84860D7AD5B | GSK10136 | JHB_DPHRU_053 | 2018-02-23 | 53.0 | Male | -26.2041 | 28.0473 | Gauteng | Johannesburg | Central_JHB | 26.79 | 63.2 | 1.54 | 136.5 | 103.5 | 4.37 | 0.78 | 0.0 | 1.0 | False |
| 1977 | HEAT_6B92951F28B3 | GSK10137 | JHB_DPHRU_053 | 2018-02-27 | 54.0 | Male | -26.2041 | 28.0473 | Gauteng | Johannesburg | Central_JHB | 39.20 | 106.6 | 1.65 | 117.5 | 82.5 | 3.44 | 0.81 | 0.0 | 1.0 | False |
| 1978 | HEAT_67AD7A632733 | GSK10138 | JHB_DPHRU_053 | 2017-07-05 | 46.0 | Male | -26.2041 | 28.0473 | Gauteng | Johannesburg | Central_JHB | 26.31 | 62.9 | 1.55 | 154.0 | 111.5 | 4.24 | 0.47 | 0.0 | 1.0 | False |
| 1979 | HEAT_5EFCFB217CAD | GSK10139 | JHB_DPHRU_053 | 2018-03-07 | 57.0 | Male | -26.2041 | 28.0473 | Gauteng | Johannesburg | Central_JHB | 36.79 | 96.0 | 1.62 | 146.5 | 96.0 | 4.41 | 1.55 | 0.0 | 1.0 | False |
| 1980 | HEAT_183B350F01FF | GSK10142 | JHB_DPHRU_053 | 2018-02-15 | 50.0 | Male | -26.2041 | 28.0473 | Gauteng | Johannesburg | Central_JHB | 37.20 | 100.9 | 1.65 | 144.5 | 89.5 | 3.77 | 0.80 | 0.0 | 1.0 | False |
| 1981 | HEAT_E31645991B0D | GSK10143 | JHB_DPHRU_053 | 2017-11-23 | 43.0 | Male | -26.2041 | 28.0473 | Gauteng | Johannesburg | Central_JHB | 34.99 | 86.1 | 1.57 | 131.5 | 86.5 | 4.56 | 0.50 | 0.0 | 1.0 | False |
| 1982 | HEAT_55C448A8437D | GSK10144 | JHB_DPHRU_053 | 2017-05-30 | 56.0 | Male | -26.2041 | 28.0473 | Gauteng | Johannesburg | Central_JHB | 33.87 | 84.7 | 1.58 | 160.5 | 99.5 | 6.75 | 1.12 | 0.0 | 1.0 | False |
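Since the sample rows carry BMI alongside weight_kg and height_m, the BMI column can be cross-checked against the usual definition, weight divided by height squared. Recomputed values will not match exactly because the displayed inputs are rounded, so the sketch below instead checks that each reported BMI falls inside the range allowed by that rounding. It assumes weight is rounded to 0.1 kg and height to 0.01 m, as the displayed precision suggests; the stored data may be more precise.

```python
# (index, weight_kg, height_m, reported BMI) copied from the sample rows above.
rows = [
    (985, 74.8, 1.67, 26.78),
    (986, 88.8, 1.73, 29.52),
    (987, 59.4, 1.83, 17.77),
    (1973, 92.9, 1.61, 35.84),
    (1974, 74.6, 1.64, 27.86),
    (1976, 63.2, 1.54, 26.79),
]

# Assumption: displayed weight is rounded to 0.1 kg and height to 0.01 m.
WEIGHT_HALF_STEP = 0.05
HEIGHT_HALF_STEP = 0.005


def bmi(weight_kg: float, height_m: float) -> float:
    """Body mass index in kg/m²."""
    return weight_kg / height_m ** 2


def bmi_bounds(weight_kg: float, height_m: float) -> tuple[float, float]:
    """Smallest and largest BMI compatible with the rounded inputs."""
    low = bmi(weight_kg - WEIGHT_HALF_STEP, height_m + HEIGHT_HALF_STEP)
    high = bmi(weight_kg + WEIGHT_HALF_STEP, height_m - HEIGHT_HALF_STEP)
    return low, high


results = []
for index, weight, height, reported in rows:
    low, high = bmi_bounds(weight, height)
    consistent = low <= reported <= high
    results.append(consistent)
    print(f"{index}: recomputed={bmi(weight, height):.2f} reported={reported:.2f} "
          f"bounds=({low:.2f}, {high:.2f}) consistent={consistent}")
```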